6  Comparison of macroinvertebrate indices

We can analyze this same data set with our second research objective in mind:

Research Objective 1:

Compare stream habitat quality for selected headwater streams across spring, summer, and fall based on macroinvertebrate as bioindicators indices.

We are going to use to bioindices to assess stream health. An index is a measure that allows us to track overall trends of something - in this case stream health. It is a way to take something very complex like stream health which is comprised of physical, biological, and chemical aspects and compare them with one easily to estimate metric. Instead of measuring each individual parameter individually we are assessing the compounding impacts on a component of the biological community. We can do this because different taxonomic groups have different tolerance levels to survive under stressful conditions. Here, we will use tolerance values assigned to an entire taxonomic group (order or family), however, we should keep in mind that the tolerance of species within that group might vary.

Let’s load our samples and create the combined data set:

# read libraries
library(readr)
library(janitor)
library(lubridate)
library(tidyr)
library(dplyr)
library(glue)
library(ggplot2)
library(patchwork)
library(knitr)

rand_2210 <- read_delim("data/macros_RAND_2022-10-19.tsv", delim = "\t") %>%
  clean_names()

rand_2305 <- read_delim("data/macros_RAND_2023-05-30.tsv", delim = "\t") %>%
  clean_names()

rand_2307 <- read_delim("data/macros_RAND_2023-07-07.tsv", delim = "\t") %>%
  clean_names()

rand_2310 <- read_delim("data/macros_RAND_2023-10-18.tsv", delim = "\t") %>%
  clean_names()

rand_2407 <- read_delim("data/macros_RAND_2024-07-09.tsv", delim = "\t") %>%
  clean_names()

rand_2410 <- read_delim("data/macros_RAND_2024-10-02.tsv", delim = "\t") %>%
  clean_names()

schb_2210 <- read_delim("data/macros_SCHB_2022-11-02.tsv", delim = "\t") %>%
  clean_names()

schb_2305 <- read_delim("data/macros_SCHB_2023-05-30.tsv", delim = "\t") %>%
  clean_names()

schb_2410 <- read_delim("data/macros_SCHB_2024-10-23.tsv", delim = "\t") %>%
  clean_names()

macros <- bind_rows(rand_2210,
                    rand_2305,
                    rand_2307,
                    rand_2310,
                    rand_2407,
                    rand_2410,
                    schb_2210,
                    schb_2305,
                    schb_2410)

6.0.1 EPT index

The first first index is the EPT index. Three major orders with low tolerance to pollution are Ephemeroptera (mayflies), Plectoptera (stoneflies), Thrichoptera (caddisflies) are sensitive to pollutants in their environment.

In short, the EPT index quantifies the percentage of sensitive orders to total orders found.

Consider this

Explain whether a higher or lower EPT index is indicative of better stream health

A higher EPT value indicates that a larger proportion of your sample is comprised of the three sensitive orders.

Let’s calculate the EPT index for each stream by determining the total number of EPT individuals and dividing that number by the total number of individuals in the data set.

We are going to use mutate() to add a new column that will classify specimen as EPT taxa or non-EPT taxa, then we can calculate the relative abundance.

kable( 
macros %>%
  mutate(EPT = ifelse(order %in% c("Ephemeroptera", "Plecoptera", "Trichoptera"), "EPT", "non-EPT")) %>%
  group_by(site_id, sample_date, EPT) %>%
  count() %>%
  ungroup() %>%
  group_by(site_id, sample_date) %>%
  mutate(EPT_index = n/(sum(n))) %>%
  filter(EPT == "EPT"),
digits = 3
)
site_id sample_date EPT n EPT_index
RAND 2022-10-19 EPT 35 0.507
RAND 2023-05-30 EPT 197 0.883
RAND 2023-07-07 EPT 14 0.298
RAND 2023-10-18 EPT 212 0.765
RAND 2024-07-09 EPT 25 0.581
RAND 2024-10-02 EPT 42 0.433
SCHB 2022-11-02 EPT 35 0.875
SCHB 2023-05-30 EPT 121 0.747
SCHB 2024-10-23 EPT 70 0.737
Consider this

Describe your results. Consider differences within and among samples

As always start by stating the general trends and patterns first in a way that somebody who does not necessarily have your tables in front of them has a good idea of what is being compared and what the major pattern is. Then point out notable highlights. Make sure you are specific.

Higher and lower are qualitative descriptions but there is a big difference if the EPT index is higher in one location by 1% or 10%!

Your answer should be 2-3 sentences.

6.0.2 Hilsenhoff Biotic Index

The second Index we will use is the Hilsenhoff Biotic Index.

It functions the same way as the EPT index where we distinguish between different tolerance levels for different taxonomic groups and then estimate the overall tolerance of the community weighted by the relative abundance of each group.

  • taxonomic groups are assigned a number from 0 - 10 pertaining to known sensitivity to organic pollutants (0 most sensitive, 10 most tolerant)
  • the number of individuals per group is multiplied by the tolerance number for that group.
  • the sum of all group tolerance values is divided by the total number of specimen in that sample.

Different ranges of the HBI value are indicative of stream health:

  • Excellent (0 - 3.75): organic pollution unlikely
  • Very Good (3.76 - 4.25): possible slight organic pollution
  • Good (4.26 - 5): some organic pollution probable
  • Fail (5 - 5.75): fairly substantial pollution likely
  • Fairly Poor (5.76 - 6.50): substantial pollution likely
  • Poor (6.51 - 7.25): very very substantial pollution likely
  • Very Poor (7.26 - 10): severe organic pollution likely

First, we will need to pull in a data set that contains the tolerance levels for each family. Keep in mind, that even within families there are going to be differences in terms of tolerance1.

  • 1 These are from this table, you will see that for some of the families in our data set we do not have tolerance values, we might need to update the information based on on macroinvertebrates.org.

  • tolerance <- read_delim("data/macros_family-tolerance.csv", delim = ",")

    Now we can join those two data frames and calculate the weighted average.

    kable(
      macros %>%
        left_join(tolerance) %>%
        filter(!is.na(tolerance)) %>%
        group_by(site_id, sample_date, family, tolerance) %>%
        count() %>%
        mutate(weighted_abund = n*tolerance) %>%
        ungroup() %>%
        group_by(site_id, sample_date) %>%
        summarize(HSI = sum(weighted_abund)/sum(n)),
      digits = 2)
    site_id sample_date HSI
    RAND 2022-10-19 2.88
    RAND 2023-05-30 1.97
    RAND 2023-07-07 3.49
    RAND 2023-10-18 2.34
    RAND 2024-07-09 3.05
    RAND 2024-10-02 2.79
    SCHB 2022-11-02 2.97
    SCHB 2023-05-30 2.31
    SCHB 2024-10-23 2.70
    Consider this

    Describe your results.

    Recall, that you will want to start by stating the general trends and patterns first in a way that somebody who does not necessarily have your tables in front of them has a good idea of what is being compared and what the major pattern is. Then point out notable highlights. Make sure you are specific. Higher and lower are qualitative descriptions but there is a big difference if the EPT index is higher in one location by 1% or 10%!

    Your answer should be 2-3 sentences.

    Consider this

    Interpret and discuss your results (species composition, bioindices) in an assessment of the stream health of our two headwater streams.

    Start your discussion with one sentence summarizing what you did and why. For example, “Here, we collected a representative sample of macroinvertebrates from two headwater streams to …” and summarize your key results in 2-3 sentences; make sure to highlight if anything was unexpected between the different analysis (species composition, EPT index, HSI) or if they all point toward the same thing.

    Then interpret what that means in terms of stream health. Consider expected differences, e.g. Schoolhouse Brook is a first order stream, Rand Brook is a second order stream, i.e. other streams flow into it and difference sample times or if you found unexpected differences what that means, for example these locations are very close to each other. From having observed the habitat, did you expect high or low stream health? Why? Explain why you think you got the results that you did.

    Your answer should be 250 - 500 words.